Evaluating Speech-Driven IR in the NTCIR-3 Web Retrieval Task
Abstract
Speech recognition has recently become a practical technology for real-world applications. To support research and development in speech-driven retrieval, which lets users retrieve information with spoken queries, we organized the speech-driven retrieval subtask in the NTCIR-3 Web retrieval task. Search topics for the Web retrieval main task were dictated by ten speakers and recorded as collections of spoken queries. We used those queries to evaluate the performance of our speech-driven retrieval system, which integrates speech recognition and text retrieval modules. The text retrieval module, which is based on a probabilistic model, indexed only the textual content of documents (Web pages) and did not use their HTML tags or hyperlink information. Experimental results showed that (a) using the target documents for language modeling and (b) increasing the vocabulary size in speech recognition were both effective in improving system performance.
Related resources
Evaluating multiple LVCSR model combination in NTCIR-3 speech-driven web retrieval task
This paper studies speech-driven Web retrieval models that accept spoken search topics (queries) in the NTCIR-3 Web retrieval task. The major focus of this paper is on improving the speech recognition accuracy of spoken queries and thereby improving retrieval accuracy in speech-driven Web retrieval. We experimentally evaluate the techniques of combining outputs of multiple LVCSR models in recognitio...
Experiments on Web Retrieval Driven by Spontaneously Spoken Queries
Motivated to realize speech-driven information retrieval systems that accept spontaneously spoken queries, we developed a method to collect such speech data derived from pre-defined search topics that had been systematically constructed for IR research. In order to evaluate both our method and the performance of document retrieval using the spontaneously spoken queries, we took p...
Evaluating Speech-Driven Web Retrieval in the Third NTCIR Workshop
Speech recognition has recently become a practical technology for real-world applications. To support research and development in speech-driven retrieval, which lets users retrieve information with spoken queries, we organized the speech-driven retrieval subtask in the NTCIR-3 Web retrieval task. Search topics for the Web retrieval main task were dictated by ten speakers and were reco...
Collecting Spontaneously Spoken Queries for Information Retrieval
Motivated to realize speech-driven information retrieval systems that accept spontaneously spoken queries, we developed a method to collect such speech data derived from pre-defined search topics that had been systematically constructed for IR research. In order to evaluate both our method and the performance of document retrieval using the spontaneously spoken queries, we took p...
Keyword recognition and extraction by multiple-LVCSRs with 60,000 words in speech-driven WEB retrieval task
This paper presents speech-driven Web retrieval models that accept spoken search topics (queries) in the NTCIR-3 Web retrieval task. We experimentally evaluate techniques for combining the outputs of multiple LVCSR models with a language model (LM) with a 60,000-word vocabulary in the recognition of spoken queries. As the model combination technique, we use SVM learning. We show that the technique...
Publication date: 2002